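The widespread presence of offensive content online, such as hate speech, poses a growing societal problem. AI tools are necessary to support the moderation process on online platforms. Evaluating these identification tools requires continuous experimentation with datasets in different languages. The HASOC track (Hate Speech and Offensive Content Identification) is dedicated to developing benchmark data for this purpose. This paper presents the HASOC subtrack for English, Hindi, and Marathi. The dataset was assembled from Twitter. This subtrack has two sub-tasks. Task A is a binary classification problem (hate and offensive vs. not offensive) offered for all three languages. Task B is a fine-grained classification problem with three classes (hate speech, offensive, and profanity) offered for English and Hindi. Overall, 652 runs were submitted by the participating teams. The best classification algorithms for Task A achieved F1 measures of 0.91, 0.78, and 0.83 for Marathi, Hindi, and English, respectively. This overview presents the tasks and the data development as well as the detailed results. The systems submitted to the competition applied a variety of technologies. The best-performing algorithms were mainly variants of transformer architectures.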
Knowledge distillation is often used to transfer knowledge from a strong teacher model to a relatively weak student model. Traditional knowledge distillation methods include response-based methods and feature-based methods. Response-based methods are the most widely used but suffer from a lower upper bound on model performance, while feature-based methods place constraints on vocabularies and tokenizers. In this paper, we propose a tokenizer-free method, liberal feature-based distillation (LEAD). LEAD aligns the distributions of the teacher model and the student model; it is effective, extendable, and portable, and has no requirements on vocabularies, tokenizers, or model architecture. Extensive experiments show the effectiveness of LEAD on several widely used benchmarks, including MS MARCO Passage, TREC Passage 19, TREC Passage 20, MS MARCO Document, TREC Document 19, and TREC Document 20.
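As a rough illustration of what aligning a teacher distribution with a student distribution can look like, the sketch below computes a KL divergence between two softmax-normalized score lists. The abstract does not say which distributions LEAD aligns or which divergence it uses, so the choice of relevance scores, the KL objective, the `temperature` parameter, and all function names here are assumptions made for illustration only.

```python
import math

def softmax(scores, temperature=1.0):
    """Turn raw scores into a probability distribution."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp((s - m) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def alignment_loss(teacher_scores, student_scores, temperature=1.0):
    """Assumed objective: KL between teacher and student distributions."""
    teacher_dist = softmax(teacher_scores, temperature)
    student_dist = softmax(student_scores, temperature)
    return kl_divergence(teacher_dist, student_dist)

# Hypothetical scores over three candidate passages.
teacher = [4.0, 1.0, 0.5]
aligned_student = [4.0, 1.0, 0.5]
misaligned_student = [0.5, 1.0, 4.0]

aligned_loss = alignment_loss(teacher, aligned_student)
misaligned_loss = alignment_loss(teacher, misaligned_student)
print(aligned_loss, misaligned_loss)
```

A loss of this shape depends only on the two score lists, not on token IDs, which is one way a distillation objective can avoid vocabulary and tokenizer constraints.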
This paper presents E5, a family of state-of-the-art text embeddings that transfer well to a wide range of tasks. The model is trained in a contrastive manner with weak supervision signals from our curated large-scale text pair dataset (called CCPairs). E5 can be readily used as a general-purpose embedding model for any task requiring a single-vector representation of texts, such as retrieval, clustering, and classification, achieving strong performance in both zero-shot and fine-tuned settings. We conduct extensive evaluations on 56 datasets from the BEIR and MTEB benchmarks. For zero-shot settings, E5 is the first model that outperforms the strong BM25 baseline on the BEIR retrieval benchmark without using any labeled data. When fine-tuned, E5 obtains the best results on the MTEB benchmark, beating existing embedding models with 40x more parameters.
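For readers unfamiliar with contrastive training on text pairs, the following minimal sketch shows an InfoNCE-style loss with in-batch negatives. The abstract only states that the model is trained "in a contrastive manner"; the specific loss form, the use of in-batch negatives, the `temperature` value, and the toy two-dimensional vectors are all assumptions, not details taken from the paper.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def info_nce_loss(queries, passages, temperature=0.05):
    """Mean InfoNCE loss; passages[i] is the positive for queries[i],
    and every other passage in the batch acts as a negative."""
    queries = [normalize(q) for q in queries]
    passages = [normalize(p) for p in passages]
    total = 0.0
    for i, q in enumerate(queries):
        logits = [dot(q, p) / temperature for p in passages]
        m = max(logits)  # log-sum-exp with max subtraction for stability
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denominator - logits[i]
    return total / len(queries)

# Hypothetical embeddings: two text pairs in a batch.
queries = [[1.0, 0.0], [0.0, 1.0]]
matched_passages = [[1.0, 0.0], [0.0, 1.0]]
swapped_passages = [[0.0, 1.0], [1.0, 0.0]]

matched_loss = info_nce_loss(queries, matched_passages)
swapped_loss = info_nce_loss(queries, swapped_passages)
print(matched_loss, swapped_loss)
```

The loss is near zero when each query is closest to its own paired passage and grows when the pairing is wrong, which is the signal that pulls paired texts together in embedding space.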
Labeling a module defective or non-defective is an expensive task. Hence, there are often limits on how much labeled data is available for training. Semi-supervised classifiers use far fewer labels for training models, but there are numerous semi-supervised methods, including self-labeling, co-training, maximal-margin, and graph-based methods, to name a few. Only a handful of these methods have been tested in SE for (e.g.) predicting defects, and even then, those tests have been on just a handful of projects. This paper takes a wide range of 55 semi-supervised learners and applies these to over 714 projects. We find that semi-supervised "co-training methods" work significantly better than other approaches. However, co-training needs to be used with caution since the specific co-training method needs to be carefully selected based on a user's specific goals. Also, we warn that a commonly used co-training method ("multi-view", where different learners get different sets of columns) does not improve predictions (while adding too much to the run time cost: 11 hours vs. 1.8 hours). Those cautions stated, we find that using these "co-trainers," we can label just 2.5% of data, then make predictions that are competitive with those using 100% of the data. It is an open question worthy of future work to test if these reductions can be seen in other areas of software analytics. All the code used and datasets analyzed during the current study are available at https://GitHub.com/Suvodeep90/Semi_Supervised_Methods.
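The mechanics of "multi-view" co-training, where different learners see different sets of columns, can be sketched as follows. This is not the study's implementation: the nearest-centroid base learner, the distance-margin confidence rule, the one-row-per-view labeling schedule, and the toy four-column data are all assumptions chosen to keep the example small. Note that the abstract itself reports that multi-view co-training did not improve predictions, so the sketch only illustrates the procedure.

```python
def project(row, columns):
    """Keep only the columns that belong to one view."""
    return [row[c] for c in columns]

def centroid(rows):
    return [sum(col) / len(col) for col in zip(*rows)]

def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def fit(labeled, columns):
    """Nearest-centroid model on one view: {label: centroid}."""
    by_label = {}
    for row, label in labeled:
        by_label.setdefault(label, []).append(project(row, columns))
    return {label: centroid(rows) for label, rows in by_label.items()}

def predict(model, row, columns):
    """Return (label, margin); a larger margin means higher confidence."""
    point = project(row, columns)
    ranked = sorted((sq_dist(point, c), label) for label, c in model.items())
    margin = ranked[1][0] - ranked[0][0] if len(ranked) > 1 else 0.0
    return ranked[0][1], margin

def co_train(labeled, unlabeled, views, rounds=10):
    """Each view's learner pseudo-labels its most confident row per round;
    pseudo-labels go into a pool shared by all views."""
    labeled, unlabeled = list(labeled), list(unlabeled)
    for _ in range(rounds):
        for columns in views:
            if not unlabeled:
                return labeled
            model = fit(labeled, columns)
            best = max(unlabeled, key=lambda row: predict(model, row, columns)[1])
            label, _ = predict(model, best, columns)
            unlabeled.remove(best)
            labeled.append((best, label))
    return labeled

# Hypothetical module metrics; label 1 = defective, 0 = non-defective.
seed = [([1.0, 1.0, 1.0, 1.0], 0), ([9.0, 9.0, 9.0, 9.0], 1)]
pool = [[2.0, 1.5, 1.0, 2.0], [8.0, 9.5, 8.5, 9.0],
        [1.5, 2.0, 2.5, 1.0], [9.5, 8.0, 9.0, 8.5]]
result = co_train(seed, pool, views=[[0, 1], [2, 3]])
for row, label in result:
    print(row, label)
```

Starting from two labeled rows, the loop labels the remaining four, which mirrors the idea of training on a small labeled fraction and letting the learners extend it.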
The increasing reliance of patients and caregivers on online communities for healthcare information has led to an increase in the spread of misinformation, as well as subjective, anecdotal, inaccurate, or non-specific recommendations, which, if acted on, could cause serious harm to patients. Hence, there is an urgent need to connect users with accurate and tailored health information in a timely manner to prevent such harm. This paper proposes an innovative approach to suggesting reliable information to participants in online communities as they move through different stages of their disease or treatment. We hypothesize that patients with similar histories of disease progression or courses of treatment would have similar information needs at comparable stages. Specifically, we pose the problem of predicting topic tags or keywords that describe the future information needs of users based on their profiles, traces of their online interactions within the community (past posts, replies), and the profiles and interaction traces of other users who resemble the target users in both respects. The result is a variant of a collaborative information filtering or recommendation system tailored to the needs of users of online health communities. We report results of our experiments on an expert-curated dataset, which demonstrate the superiority of the proposed approach over state-of-the-art baselines with respect to accurate and timely prediction of topic tags (and hence information sources of interest).
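The collaborative-filtering idea in this abstract, predicting a user's future topic tags from users with similar past traces, can be sketched with a simple similarity-weighted vote. The abstract does not describe the actual model, so the Jaccard similarity, the voting scheme, the `top_k` parameter, and the tag names below are hypothetical and purely illustrative.

```python
from collections import Counter

def jaccard(a, b):
    """Overlap between two tag collections, from 0.0 to 1.0."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def predict_future_tags(target_past, others, top_k=2):
    """others: list of (past_tags, later_tags) pairs for other users.
    Each user votes for their later tags, weighted by how similar
    their past tags are to the target user's past tags."""
    votes = Counter()
    for past_tags, later_tags in others:
        weight = jaccard(target_past, past_tags)
        for tag in set(later_tags):
            votes[tag] += weight
    ranked = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, score in ranked[:top_k] if score > 0]

# Hypothetical interaction traces reduced to topic tags.
target_past = ["diagnosis", "biopsy"]
others = [
    (["diagnosis", "biopsy"], ["chemotherapy", "side effects"]),
    (["diagnosis", "biopsy", "surgery"], ["chemotherapy", "recovery"]),
    (["insurance"], ["billing"]),
]
predicted = predict_future_tags(target_past, others)
print(predicted)
```

Users whose histories share nothing with the target contribute zero weight, so their later topics never surface in the prediction.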
We present a retrospective on the state of Embodied AI research. Our analysis focuses on 13 challenges presented at the Embodied AI Workshop at CVPR. These challenges are grouped into three themes: (1) visual navigation, (2) rearrangement, and (3) embodied vision-and-language. We discuss the dominant datasets within each theme, evaluation metrics for the challenges, and the performance of state-of-the-art models. We highlight commonalities between top approaches to the challenges and identify potential future directions for Embodied AI research.
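Knowledge distillation is an effective way to transfer knowledge from a strong teacher to an efficient student model. Ideally, we expect that the better the teacher is, the better the student will be. However, this expectation does not always hold. It is common for a better teacher model to produce a poor student via distillation because of the non-negligible gap between teacher and student. To bridge the gap, we propose PROD, a progressive distillation method for dense retrieval. PROD consists of a teacher progressive distillation and a data progressive distillation to gradually improve the student. We conduct extensive experiments on five widely used benchmarks, MS MARCO Passage, TREC Passage 19, TREC Document 19, MS MARCO Document, and Natural Questions, where PROD achieves state-of-the-art results among distillation methods for dense retrieval. The code and models will be released.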
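As free online encyclopedias with massive volumes of content, Wikipedia and Wikidata are key to many natural language processing (NLP) tasks, such as information retrieval, knowledge base building, machine translation, text classification, and text summarization. In this paper, we introduce WikiDes, a novel dataset for generating short descriptions of Wikipedia articles, framed as a text summarization problem. The dataset consists of 80K English samples on 6987 topics. We set up a two-phase summarization method, description generation (Phase I) and candidate ranking (Phase II), as a strong approach that relies on transfer and contrastive learning. For description generation, T5 and BART show their superiority compared to other small-scale pre-trained models. By applying contrastive learning with the diverse input from beam search, the metric-based ranking models outperform the direct description generation models by up to 22 ROUGE in both the topic-exclusive split and the topic-independent split. Furthermore, in human evaluation against the gold descriptions, the Phase II descriptions were chosen in over 45.33% of cases, compared to 23.66% for Phase I. In terms of sentiment analysis, the generated descriptions cannot effectively capture all sentiment polarities from the paragraphs, whereas the gold descriptions do this better. The automatic generation of new descriptions reduces the human effort needed to create them and enriches Wikidata-based knowledge graphs. Our paper has a practical impact on Wikipedia and Wikidata, since there are thousands of missing descriptions. Finally, we expect WikiDes to be a useful dataset for related work on capturing salient information from short paragraphs. The curated dataset is publicly available at: https://github.com/declare-lab/wikides.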
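Reviews contain rich information about product characteristics and user interests and are therefore commonly used to boost recommender system performance. Specifically, previous work shows that jointly learning to perform review generation improves rating prediction performance. Meanwhile, these model-produced reviews serve as recommendation explanations, providing users with insights into the predicted ratings. However, while existing models can generate fluent, human-like reviews, it is unclear to what degree the reviews fully uncover the rationale behind the jointly predicted rating. In this work, we perform a series of evaluations that probe state-of-the-art models and their review generation components. We show that the generated explanations are brittle and need further evaluation before being taken as literal rationales for the estimated ratings.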
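Recent models can generate fluent and grammatical synthetic reviews while accurately predicting user ratings. The generated reviews, which express users' estimated opinions of the related products, are often viewed as natural language "rationales" for the jointly predicted rating. However, previous studies found that existing models often generate repetitive, universally applicable, and generic explanations, resulting in uninformative rationales. Further, our analysis shows that the content generated by previous models often contains factual hallucinations. These issues call for novel solutions that can generate explanations that are both informative and factually grounded. Inspired by the recent success of using retrieved content in addition to parametric knowledge for generation, we propose to augment the generator with a personalized retriever, where the retriever's output serves as external knowledge for enhancing the generator. Experiments on the Yelp, TripAdvisor, and Amazon Movie Reviews datasets show that our model can generate explanations that more reliably entail existing reviews, are more diverse, and are rated as more informative by human evaluators.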